AITopics | safe reinforcement learning

8f75af4704feac629a560f4ad6b67cef-Paper-Conference.pdf

Neural Information Processing Systems, Nov-19-2025, 21:11:25 GMT

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Appendices

Neural Information Processing Systems, Nov-16-2025, 18:46:05 GMT

Note that this safe RL problem is less general than the standard formulation of safe RL. The authors introduce a teacher-student hierarchy. To learn the teacher's policy the following constraints are followed: a1 The unsafe set is contained in the intervention set D D The teacher learns when to intervene and to switch between different interventions. A1.2 RL with probability one constraints We have introduced the safety state to the environment as follows: s First, we discuss our design for the PI controller and discuss the necessary parts for it. The proportional part delivers brute force control by having a large control magnitude for large errors, but it is not effective if the instantaneous error values become small.

ablation, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

ProSh: Probabilistic Shielding for Model-free Reinforcement Learning

Court, Edwin Hamel-De le, Ohlmann, Gaspard, Belardinelli, Francesco

arXiv.org Artificial Intelligence, Oct-22-2025

Safety is a major concern in reinforcement learning (RL): we aim to develop RL systems that not only perform optimally, but are also safe to deploy, with formal guarantees about their safety. To this end, we introduce Probabilistic Shielding via Risk Augmentation (ProSh), a model-free algorithm for safe reinforcement learning under cost constraints. ProSh augments the Constrained MDP state space with a risk budget and enforces safety by applying a shield to the agent's policy distribution using a learned cost critic. The shield ensures that all sampled actions remain safe in expectation. We also show that optimality is preserved when the environment is deterministic. Since ProSh is model-free, safety during training depends on the knowledge we have acquired about the environment. We provide a tight upper bound on the cost in expectation, depending only on the backup-critic accuracy, that is always satisfied during training. Under mild, practically achievable assumptions, ProSh guarantees safety even at training time, as shown in the experiments.
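The shielding idea in the abstract above can be illustrated with a small sketch. This is not the ProSh algorithm itself: it only shows one simple way to reshape a discrete policy distribution so that its expected cost, as estimated by a cost critic, stays within a risk budget. The function name `shield_policy`, the per-action cost estimates `cost_q`, and the rule of mixing with the lowest-cost action are all assumptions made for illustration.

```python
def shield_policy(probs, cost_q, budget):
    """Return action probabilities whose expected critic cost is <= budget
    whenever some action's estimated cost is below the budget.

    probs  : action probabilities from the agent's policy (hypothetical input)
    cost_q : per-action cost estimates from a learned cost critic (hypothetical)
    budget : remaining risk budget carried in the augmented state
    """
    probs, cost_q = list(probs), list(cost_q)
    expected = sum(p * c for p, c in zip(probs, cost_q))
    if expected <= budget:
        return probs  # already safe in expectation, leave the policy untouched

    min_cost = min(cost_q)
    safest = cost_q.index(min_cost)
    if min_cost >= budget:
        # The budget cannot be met: fall back to the least costly action.
        return [1.0 if i == safest else 0.0 for i in range(len(probs))]

    # Mix the policy with the safest action so the expected cost equals the budget:
    # lam * expected + (1 - lam) * min_cost == budget
    lam = (budget - min_cost) / (expected - min_cost)
    return [lam * p + ((1.0 - lam) if i == safest else 0.0)
            for i, p in enumerate(probs)]


shielded = shield_policy([0.5, 0.5], [0.0, 1.0], budget=0.25)
print(shielded)  # [0.75, 0.25]
```

In this toy case the unshielded policy has expected cost 0.5, above the budget of 0.25, so probability mass is shifted toward the zero-cost action until the expected cost matches the budget. How reliable this is depends entirely on the accuracy of the cost estimates, which is consistent with the abstract's remark that the guarantee depends on critic accuracy.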

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2510.1572

Country:

Europe > United Kingdom > England > Greater London > London (0.40)
North America > United States > Maryland > Baltimore (0.14)
Europe > Austria > Vienna (0.14)
(11 more...)

Genre: Research Report (0.64)

Industry:

Education > Educational Setting > Higher Education (0.40)
Leisure & Entertainment > Sports (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Learning to Undo: Rollback-Augmented Reinforcement Learning with Reversibility Signals

Sorstkins, Andrejs, Tariq, Omer, Bilal, Muhammad

arXiv.org Artificial Intelligence, Oct-17-2025

This paper proposes a reversible learning framework to improve the robustness and efficiency of value-based reinforcement learning agents, addressing vulnerability to value overestimation and instability in partially irreversible environments. The framework has two complementary core mechanisms: an empirically derived transition reversibility measure, Φ(s, a), and a selective state rollback operation. Φ is an online per-state-action estimator that quantifies the likelihood of returning to a prior state within a fixed horizon K. This measure is used to dynamically adjust the penalty term during temporal-difference updates, integrating reversibility awareness directly into the value function. The system also includes a selective rollback operator. When an action yields an expected return markedly lower than its instantaneous estimated value and violates a predefined threshold, the agent is penalized and returns to the preceding state rather than progressing. This interrupts suboptimal, high-risk trajectories and avoids catastrophic steps. By combining reversibility-aware evaluation with targeted rollback, the method improves safety, performance, and stability. In the CliffWalking-v0 domain, the framework reduced catastrophic falls by over 99.8% and yielded a 55% increase in mean episode return. In the Taxi-v3 domain, it suppressed illegal actions by at least 99.9% and achieved a 65.7% improvement in cumulative reward, while also sharply reducing reward variance in both environments. Ablation studies confirm that the rollback mechanism is the critical component underlying these safety and performance gains, marking a robust step toward safe and reliable sequential decision making.
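As a rough illustration of the reversibility measure described above (Phi), the sketch below keeps a running frequency, per state-action pair, of how often the agent was back in the same state within K steps. It assumes hashable tabular states and is a count-based reading of the abstract, not the authors' estimator; the class name `ReversibilityEstimator` and its methods are hypothetical.

```python
from collections import defaultdict


class ReversibilityEstimator:
    """Count-based estimate of Phi(s, a): the empirical frequency with which
    taking action a in state s was followed by a return to s within K steps."""

    def __init__(self, horizon):
        self.horizon = horizon            # the fixed horizon K
        self.visits = defaultdict(int)    # times (s, a) was taken
        self.returns = defaultdict(int)   # times s was seen again within K steps

    def update(self, states, actions):
        """Update counts from one trajectory.

        states has len(actions) + 1 entries; states[t + 1] follows actions[t].
        Windows near the end of the trajectory are shorter than K, which
        biases the estimate downward for late steps.
        """
        for t, (s, a) in enumerate(zip(states, actions)):
            window = states[t + 1 : t + 1 + self.horizon]
            self.visits[(s, a)] += 1
            if s in window:
                self.returns[(s, a)] += 1

    def phi(self, s, a, default=0.0):
        """Return the estimated reversibility, or default if (s, a) is unseen."""
        n = self.visits.get((s, a), 0)
        return self.returns.get((s, a), 0) / n if n else default


est = ReversibilityEstimator(horizon=2)
est.update(states=["A", "B", "A", "C", "D"],
           actions=["right", "left", "down", "down"])
print(est.phi("A", "right"))  # 1.0, state A was reached again within 2 steps
print(est.phi("A", "down"))   # 0.0, the agent never came back to A
```

A value near zero flags a transition that was rarely undone in practice, which is the kind of signal the abstract describes feeding into the penalty term of the temporal-difference update.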

machine learning, reinforcement learning, rollback, (17 more...)

arXiv.org Artificial Intelligence

2510.14503

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.83)

Industry:

Transportation > Passenger (0.48)
Transportation > Ground > Road (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

Neural Information Processing Systems, Oct-10-2025, 09:22:28 GMT

Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using a pre-collected dataset.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Appendices

Neural Information Processing Systems, Aug-19-2025, 11:55:31 GMT

Note that this safe RL problem is less general than the standard formulation of safe RL. The authors introduce a teacher-student hierarchy. To learn the teacher's policy the following constraints are followed: a1 The unsafe set is contained in the intervention set D D The teacher learns when to intervene and to switch between different interventions. A1.2 RL with probability one constraints We have introduced the safety state to the environment as follows: s First, we discuss our design for the PI controller and discuss the necessary parts for it. The proportional part delivers brute force control by having a large control magnitude for large errors, but it is not effective if the instantaneous error values become small.

ablation, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Safe Planning and Policy Optimization via World Model Learning

Latyshev, Artem, Gorbov, Gregory, Panov, Aleksandr I.

arXiv.org Artificial Intelligence, Jun-6-2025

Reinforcement Learning (RL) applications in real-world scenarios must prioritize safety and reliability, which impose strict constraints on agent behavior. Model-based RL leverages predictive world models for action planning and policy optimization, but inherent model inaccuracies can lead to catastrophic failures in safety-critical settings. We propose a novel model-based RL framework that jointly optimizes task performance and safety. To address world model errors, our method incorporates an adaptive mechanism that dynamically switches between model-based planning and direct policy execution. We resolve the objective mismatch problem of traditional model-based approaches using an implicit world model. Furthermore, our framework employs dynamic safety thresholds that adapt to the agent's evolving capabilities, consistently selecting actions that surpass safe policy suggestions in both performance and safety. Experiments demonstrate significant improvements over non-adaptive methods, showing that our approach optimizes safety and performance simultaneously rather than merely meeting minimum safety requirements. The proposed framework achieves robust performance on diverse safety-critical continuous control tasks, outperforming existing methods.

machine learning, reinforcement learning, world model, (17 more...)

arXiv.org Artificial Intelligence

2506.04828

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
North America > United States (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning

Neural Information Processing Systems, May-27-2025, 19:48:25 GMT

Safe reinforcement learning (RL) requires the agent to finish a given task while obeying specific constraints. Giving constraints in natural language form has great potential for practical scenarios due to its flexible transfer capability and accessibility. Previous safe RL methods with natural language constraints typically need to design cost functions manually for each constraint, which requires domain expertise and lacks flexibility. In this paper, we harness the dual role of text in this task, using it not only to provide the constraint but also as a training signal. We introduce the Trajectory-level Textual Constraints Translator (TTCT) to replace the manually designed cost function.

machine learning, natural language, reinforcement learning, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

Neural Information Processing Systems, May-26-2025, 18:13:12 GMT

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring extensive interactions with the environment to learn a safe policy. We propose Efficient Safe Policy Optimization (ESPO), a novel approach that enhances the efficiency of safe RL through sample manipulation. ESPO employs an optimization framework with three modes: maximizing rewards, minimizing costs, and balancing the trade-off between the two. By dynamically adjusting the sampling process based on the observed conflict between reward and safety gradients, ESPO theoretically guarantees convergence, optimization stability, and improved sample complexity bounds.

artificial intelligence, machine learning, safe reinforcement learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reviews: Constrained Cross-Entropy Method for Safe Reinforcement Learning

Neural Information Processing Systems, May-26-2025, 05:38:51 GMT

This paper studies constrained optimal control, where the goal is to produce a policy that maximizes an objective function subject to a constraint. The authors provide great motivation for this setting, explaining why the constraint cannot simply be included as a large negative reward. They detail challenges in solving this problem, especially if the initial policy does not satisfy the constraint. They also note a clever extension of their method, where they use the constraint to define the objective, by setting the constraint to indicate whether the task is solved. Their algorithm builds upon CEM: at each iteration, if there are no feasible policies, they maximize the constraint function for the policies with the largest objective; otherwise, they maximize the objective function for feasible policies.
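The elite-selection rule summarized in the review above can be sketched as follows. This is a simplified reading, not the paper's exact procedure: when no sampled candidate is feasible, candidates are ranked by constraint value alone, and when at least one is feasible, only feasible candidates are ranked by objective. The function name `select_elites` and the convention that `constraint(x) >= 0` means feasible are assumptions made for illustration.

```python
def select_elites(samples, objective, constraint, n_elite):
    """Pick the elite candidates for one cross-entropy-method iteration.

    samples    : candidate policies (or policy parameters)
    objective  : callable, higher is better
    constraint : callable, a value >= 0 means the candidate is feasible (assumed)
    n_elite    : maximum number of elites to keep
    """
    feasible = [x for x in samples if constraint(x) >= 0]
    if feasible:
        # Some candidates satisfy the constraint: maximize the objective among them.
        ranked = sorted(feasible, key=objective, reverse=True)
    else:
        # Nothing is feasible yet: move toward feasibility first.
        ranked = sorted(samples, key=constraint, reverse=True)
    return ranked[:n_elite]


# Toy problem: maximize x subject to x <= 1, written as 1 - x >= 0.
print(select_elites([-2, -1, 0, 1, 2, 3], lambda x: x, lambda x: 1 - x, 2))  # [1, 0]
print(select_elites([3, 5, 4], lambda x: x, lambda x: 1 - x, 2))             # [3, 4]
```

As in the standard cross-entropy method, the selected elites would then be used to refit the sampling distribution for the next iteration. The second call shows the infeasible case: the candidates closest to satisfying the constraint are kept even though their objective values are lower.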

constraint, machine learning, reinforcement learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
